Synthesizing Products for Online Catalogs

Authors

  • Hoa Nguyen
  • Ariel Fuxman
  • Stelios Paparizos
  • Juliana Freire
  • Rakesh Agrawal

Abstract

A comprehensive product catalog is essential to the success of Product Search engines and shopping sites such as Yahoo! Shopping, Google Product Search, and Bing Shopping. Given the large number of products and the speed at which they are released to the market, keeping catalogs up-to-date is a challenging task that calls for automated techniques. In this paper, we introduce the problem of product synthesis, a key component of catalog creation and maintenance. Given a set of offers advertised by merchants, the goal is to identify new products and add them to the catalog, together with their (structured) attributes. A fundamental challenge in product synthesis is the scale of the problem. A Product Search engine receives data from thousands of merchants about millions of products; the product taxonomy contains thousands of categories, where each category has a different schema; and merchants use representations for products that are different from the ones used in the catalog of the Product Search engine. We propose a system that provides an end-to-end solution to the product synthesis problem, and addresses issues involved in data extraction from offers, schema reconciliation, and data fusion. For the schema reconciliation component, we developed a novel and scalable technique for schema matching that leverages previously known instance-level associations between offers and products and is trained on automatically created training sets (no manually labeled data is needed). We present an experimental evaluation using data from Bing Shopping for more than 800K offers, a thousand merchants, and 400 categories. The evaluation confirms that our approach is able to automatically generate a large number of accurate product specifications. Furthermore, the evaluation shows that our schema reconciliation component outperforms state-of-the-art schema matching techniques in terms of precision and recall.
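To make the idea of schema matching from instance-level associations concrete, the following is a minimal sketch and not the technique developed in the paper. It swaps the paper's trained matcher for a plain value-agreement score: for offers already associated with catalog products, it counts how often a merchant attribute and a catalog attribute carry the same value. The function names, the `min_score` threshold, and the sample attributes are all hypothetical.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and collapse whitespace so trivially different strings compare equal."""
    return " ".join(str(value).lower().split())


def match_attributes(associations, min_score=0.5):
    """Map each merchant attribute to the catalog attribute whose values agree most often.

    `associations` is a list of (offer, product) pairs already known to describe
    the same item; each side is a dict of attribute name -> value.
    This is an illustrative heuristic, not the paper's learned matcher.
    """
    agree = defaultdict(int)  # (offer_attr, product_attr) -> number of value matches
    seen = defaultdict(int)   # offer_attr -> number of associated offers carrying it
    for offer, product in associations:
        for o_attr, o_val in offer.items():
            seen[o_attr] += 1
            for p_attr, p_val in product.items():
                if normalize(o_val) == normalize(p_val):
                    agree[(o_attr, p_attr)] += 1

    mapping = {}
    for o_attr, total in seen.items():
        # Score each candidate by the fraction of offers whose value agreed.
        candidates = [
            (count / total, p_attr)
            for (a, p_attr), count in agree.items()
            if a == o_attr
        ]
        if candidates:
            score, best = max(candidates)
            if score >= min_score:
                mapping[o_attr] = best
    return mapping


# Hypothetical sample data: two offers already linked to catalog products.
associations = [
    ({"mfr": "Canon", "mp": "12.1", "colour": "Black"},
     {"Brand": "Canon", "Megapixels": "12.1", "Color": "black"}),
    ({"mfr": "Nikon", "mp": "16", "colour": "Red"},
     {"Brand": "Nikon", "Megapixels": "16", "Color": "Red"}),
]
mapping = match_attributes(associations)
```

The resulting correspondences could then be applied to offers that have no associated product, translating merchant attribute names into the catalog schema before values from several offers are fused into one product specification.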

Similar resources

Cross-Supplier Bundling of Tourist Products with Multi-Vendor Catalogs

Tourist services are one of the most popular products offered online. This has given rise to all service suppliers to offer their products over the Internet. Currently elementary services as hotels, flights or rental cars and travel packages can be booked online. Despite the high interdependence of tourist products there is no possibility for easy and user friendly online bundling of tourist pr...

Intelligent Electronic Catalogs for Sales Support Introducing Case-Based Reasoning Techniques to On-Line Product Selection Applications

The number of electronic catalogs has grown rapidly during the past few years. Most of these catalogs use standard databases for storing and retrieving product information. Using ordinary databases for product catalogs, however, has the major drawback that it is often very difficult to find the products desired: very often, the database does not return a matching product at all or it returns ma...

Analytical Product Selection Using a Highly-Dense Interface for Online Product Catalogs

One of the key elements of e-commerce systems is the online product catalog. It provides sellers with a content management system that assembles, aggregates, normalizes, and distributes product information. It also provides potential buyers with an interactive interface that offers a multimedia representation of the product information as well as retrieval, classification and ordering services....

Building Adaptive E-Catalog Communities Based on User Interaction Patterns

online, such as at www.dell.com or Amazon.com. Most solutions to the problem of organizing and integrating e-catalogs use a category-based hierarchy to structure a product catalog in a “one view fits all” fashion. This hierarchy is often determined by a system designer, who usually has a priori expectations of how customers will explore catalogs. However, customers might have different expectat...

Navigating Product Catalogs Using 2+1D Fisheye Browser

This paper introduces a novel user interface of online product catalogs. Our research focused on its capability for helping shoppers to navigate and analyze online product information. Specifically, we discuss the application of a new multi-level focus+context visualization technique, called 2+1D Fisheye Browser; in the design of interactive visual interface which can be used to assist buyers i...

Journal:
  • PVLDB

Volume 4

Publication date: 2011